Multiple Action Detection


Generating Vision-Language Navigation Instructions Incorporated Fine-Grained Alignment Annotations

Jun 10, 2025
Via arXiv

BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism

May 27, 2025
Via arXiv

Saliency-guided Emotion Modeling: Predicting Viewer Reactions from Video Stimuli

May 25, 2025
Via arXiv

Safe Uncertainty-Aware Learning of Robotic Suturing

May 22, 2025
Via arXiv

ImgEdit: A Unified Image Editing Dataset and Benchmark

May 26, 2025
Via arXiv

Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents

May 19, 2025
Via arXiv

Automated Real-time Assessment of Intracranial Hemorrhage Detection AI Using an Ensembled Monitoring Model (EMM)

May 16, 2025
Via arXiv

H2R: A Human-to-Robot Data Augmentation for Robot Pre-training from Videos

May 17, 2025
Via arXiv

StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation

May 15, 2025
Via arXiv

Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges

May 06, 2025
Via arXiv